TCGA2STAT: simple TCGA data access for integrated statistical analysis in R

Authors

  • Ying-Wooi Wan
  • Genevera I. Allen
  • Zhandong Liu
Abstract

MOTIVATION: Massive amounts of high-throughput genomics data profiled from tumor samples have been made publicly available by The Cancer Genome Atlas (TCGA).

RESULTS: We have developed an open-source software package, TCGA2STAT, that obtains TCGA data, wrangles it, and pre-processes it into a format ready for multivariate and integrated statistical analysis in the R environment. With a single function call, the package downloads and fully processes the desired TCGA data so that it can be integrated seamlessly into a computational analysis pipeline. No further technical or biological knowledge is needed to use the software, which makes TCGA data easily accessible to data scientists without specific domain knowledge.

AVAILABILITY AND IMPLEMENTATION: TCGA2STAT is available from https://cran.r-project.org/web/packages/TCGA2STAT/index.html

SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.

CONTACT: [email protected]
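
As a rough illustration of the "single function call" workflow described in the abstract, the sketch below wraps the package's `getTCGA` function. It assumes that TCGA2STAT is installed and that the remote servers it downloads from are reachable; the helper name `fetch_tcga_rnaseq` and the choice of ovarian cancer ("OV") with the "RNASeq2" data type are illustrative assumptions, not part of the abstract.

```r
# Sketch only: assumes the TCGA2STAT package is installed and that the
# remote data servers it downloads from are reachable.
# fetch_tcga_rnaseq is a hypothetical helper name used for illustration.
fetch_tcga_rnaseq <- function(disease = "OV") {
  if (!requireNamespace("TCGA2STAT", quietly = TRUE)) {
    stop("TCGA2STAT is not installed")
  }
  # One call downloads and pre-processes the requested data type.
  TCGA2STAT::getTCGA(disease = disease, data.type = "RNASeq2")
}

# Example use (needs network access, so it is left commented out):
# ov <- fetch_tcga_rnaseq("OV")
# dim(ov$dat)  # processed data matrix returned by the package
```

Wrapping the call in a small function keeps the download step in one place, so the returned object can be passed directly to downstream statistical code.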

Similar resources

Big linked cancer data: Integrating linked TCGA and PubMed

The amount of bio-medical data available on the Web grows exponentially with time. The resulting large volume of data makes manual exploration very tedious. Moreover, the velocity at which this data changes and the variety of formats in which bio-medical data is published make it difficult to access in an integrated form. Finally, the lack of an integrated vocabulary makes querying this d...

Fostering Serendipity through Big Linked Data

The amount of bio-medical data available over the Web grows exponentially with time. The large volume of currently available data makes it difficult to explore, while the velocity at which this data changes and the variety of formats in which bio-medical data is published make it difficult to access in an integrated form. Moreover, the lack of an integrated vocabulary makes querying this d...

RTCGAToolbox: A New Tool for Exporting TCGA Firehose Data

BACKGROUND & OBJECTIVE: Managing data from large-scale projects such as The Cancer Genome Atlas (TCGA) for further analysis is an important and time-consuming step for research projects. Several efforts, such as the Firehose project, make TCGA pre-processed data publicly available via web services and data portals, but this information must be managed, downloaded and prepared for subsequent st...

Performance Improvement of Expanded Integrated Local Area Networks (RESEARCH NOTE)

In Local Area Networks (LANs) connected by bridges, flow control and smooth traffic in the network are very important. However, congestion at bridges can cause heavy loss of received frames. The lost frames are discarded and must be retransmitted by the source station, which causes more congestion and a massive reduction in overall network throughput. The netw...

Rock Brittleness Prediction Using Geomechanical Properties of Hamekasi Limestone: Regression and Artificial Neural Networks Analysis

A cold climate favors the development of tension cracks and a decrease in rock brittleness. Therefore, this paper investigates the Hamekasi porous limestone in order to predict the brittleness indices during freeze-thaw cycles. The freeze-thaw test was executed for one cycle including 16 h of freezing and 8 h of thawing. The geomechanical properties and brittl...

Journal title:
  • Bioinformatics

Volume 32, Issue 6

Pages: -

Publication date: 2016